Mining Parsing Results for Lexical Correction: Toward a Complete Correction Process of Wide-Coverage Lexicons
Abstract
The coverage of a parser depends mostly on the quality of the underlying grammar and lexicon. Developing a lexicon that is both complete and accurate is an intricate and demanding task. We introduce an automatic process for detecting missing, incomplete and erroneous entries in a morphological and syntactic lexicon, and for suggesting correction hypotheses for these entries. The detection of dubious lexical entries is tackled by two different techniques: the first is based on a specific statistical model, while the second benefits from information provided by a part-of-speech tagger. Correction hypotheses for dubious lexical entries are generated by studying which modifications could improve the successful parse rate of the sentences in which they occur. This process brings together various techniques based on taggers, parsers and statistical models. We report on its application to improving a large-coverage morphological and syntactic French lexicon, the Lefff.
Similar resources
Error Mining for Wide-Coverage Grammar Engineering
Parsing systems which rely on hand-coded linguistic descriptions can only perform adequately in as far as these descriptions are correct and complete. The paper describes an error mining technique to discover problems in hand-coded linguistic descriptions for parsing such as grammars and lexicons. By analysing parse results for very large unannotated corpora, the technique discovers missing, in...
Correcting Syntactic Annotation Errors Based on Tree Mining
This paper provides a new method to correct annotation errors in a treebank. The previous error correction method constructs a pseudo parallel corpus where incorrect partial parse trees are paired with correct ones, and extracts error correction rules from the parallel corpus. By applying these rules to a treebank, the method corrects errors. However, this method does not achieve wide coverage ...
Correcting Errors in a Treebank Based on Tree Mining
This paper provides a new method to correct annotation errors in a treebank. The previous error correction method constructs a pseudo parallel corpus where incorrect partial parse trees are paired with correct ones, and extracts error correction rules from the parallel corpus. By applying these rules to a treebank, the method corrects errors. However, this method does not achieve wide coverage ...
A Modified Schimazek’s F-abrasiveness Factor for Evaluating Abrasiveness of Andesite Rocks in Rock Sawing Process
One of the most crucial factors involved in the optimum design and cost estimation of rock sawing process is the rock abrasivity that could result in a significant cost increase. Various methods including direct and indirect tests have been introduced in order to measure rock abrasivity. The Schimazek’s F-abrasiveness factor ( ) is one of the most common indices to assess rock abrasivity. is t...
Role of smile correction in mineral detection on hyperion data
This work aims to extract the mineralogical constituents of the Lahroud Hyperion scene situated in the NW of Iran. Like the other push-broom sensors, Hyperion images suffer from spectral distortions, namely the smile effect. The corresponding spectral curvature is defined as an across-track wavelength shift from the nominal central wavelength, and alters the pixel spectra. The common “column me...
Publication date: 2007